

Section: New Results

Machine learning for model acquisition

Participants : Thomas Guyet, René Quiniou.

Model acquisition is an important issue for model-based diagnosis, especially when modeling dynamic systems. We investigate machine learning methods for temporal data recorded by sensors and for spatial data resulting from simulation processes. We also investigate efficient methods for storing and accessing large volumes of simulation data. Our main interest is in extracting knowledge, especially sequential and temporal patterns or prediction rules, from static or dynamic data (data streams). We are particularly interested in mining temporal patterns with numerical information and in incremental mining from sequences recorded by sensors.

Mining temporal patterns with numerical information

We are interested in mining interval-based temporal patterns from event sequences in which each event is associated with a type and a time interval. Temporal patterns are sets of constrained interval-based events. This year, we have worked on improving both the formal setting of the approach and its efficiency [8]. We have introduced the notion of ϵ-covering of temporal patterns over sequences to cope with the dual nature, symbolic and numerical, of temporal patterns. The parameter ϵ specifies the tightness of the similarity used for matching patterns against sequences. It complements the parameter σ, the minimal support used to prune candidate patterns. The ϵ-similar occurrences of a pattern, or more precisely their associated temporal intervals, are classified to characterize the different classes of numerical temporal intervals that correspond to different patterns sharing the same symbolic part. This process has been embedded in two sequential pattern mining algorithms, GSP and PrefixSpan, and we have compared their performance.
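To make the roles of ϵ and σ concrete, here is a minimal Python sketch. It is not the formal definition from [8]: it assumes that an occurrence matches a pattern when the event types agree in order and the bounds of each interval differ by at most ϵ (a Chebyshev distance chosen for illustration). The names `interval_distance`, `epsilon_covers` and `support` are hypothetical.

```python
from typing import List, Tuple

# An event is (type, start, end). This representation is an assumption.
Event = Tuple[str, float, float]


def interval_distance(a: Event, b: Event) -> float:
    """Largest gap between the bounds of two intervals (assumed measure)."""
    return max(abs(a[1] - b[1]), abs(a[2] - b[2]))


def epsilon_covers(pattern: List[Event], sequence: List[Event], eps: float) -> bool:
    """True if the sequence contains, in order, one event per pattern item
    with the same type and interval bounds within eps of the pattern's."""

    def search(p_idx: int, s_idx: int) -> bool:
        if p_idx == len(pattern):
            return True  # every pattern item has been matched
        for i in range(s_idx, len(sequence)):
            same_type = sequence[i][0] == pattern[p_idx][0]
            if same_type and interval_distance(sequence[i], pattern[p_idx]) <= eps:
                # Backtracking: try the remaining items after this match.
                if search(p_idx + 1, i + 1):
                    return True
        return False

    return search(0, 0)


def support(pattern: List[Event], sequences: List[List[Event]], eps: float) -> float:
    """Fraction of sequences covered by the pattern at tightness eps."""
    covered = sum(epsilon_covers(pattern, s, eps) for s in sequences)
    return covered / len(sequences)


pattern = [("A", 0.0, 2.0), ("B", 3.0, 5.0)]
sequences = [
    [("A", 0.2, 2.1), ("B", 3.0, 5.4)],  # close to the pattern
    [("A", 0.0, 2.0), ("B", 6.0, 9.0)],  # B interval too far away
    [("B", 3.0, 5.0), ("A", 0.0, 2.0)],  # right intervals, wrong order
]
loose = support(pattern, sequences, eps=0.5)
tight = support(pattern, sequences, eps=0.1)
```

With these toy sequences, only the first one is covered at ϵ = 0.5, and none is covered at ϵ = 0.1. A candidate pattern would then be kept only if its support reaches σ.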

Incremental sequential mining

We investigate the problem of mining and maintaining frequent sequences in a window sliding over a stream of itemsets. In [11], we propose a complete and correct incremental algorithm based on a tree representation of frequent sequences inspired by PSP [52] and on a method for counting the minimal occurrences of a sequence. Instead of a frequency, each node representing a pattern is associated with the set of occurrences of this pattern. The algorithm efficiently updates the tree representation of frequent sequences and their occurrences by means of two operations on the tree: deletion of the itemset at the beginning of the window (obsolete data) and addition of an itemset at the end of the window (new data). Experiments were conducted on simulated data and on real instantaneous power consumption data.
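The two window operations can be illustrated with a deliberately simplified Python sketch. It is not the algorithm of [11]: it tracks only single-item patterns in a flat dictionary rather than sequences in a PSP-like tree, and it does not compute minimal occurrences. The class name `WindowOccurrences` and its parameters are hypothetical. It does show each pattern being stored with its set of occurrences rather than a count, and maintained by one addition and one deletion per step.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, FrozenSet, Iterable, List, Set, Tuple


class WindowOccurrences:
    """Occurrence positions of single items in a sliding window (sketch)."""

    def __init__(self, window_size: int, min_support: int) -> None:
        self.window_size = window_size
        self.min_support = min_support
        self.window: Deque[Tuple[int, FrozenSet[str]]] = deque()
        self.occurrences: Dict[str, Deque[int]] = defaultdict(deque)
        self.next_position = 0

    def add(self, itemset: Iterable[str]) -> None:
        """Append a new itemset at the end of the window (new data)."""
        position = self.next_position
        self.next_position += 1
        items = frozenset(itemset)
        self.window.append((position, items))
        for item in items:
            self.occurrences[item].append(position)
        if len(self.window) > self.window_size:
            self.delete_oldest()

    def delete_oldest(self) -> None:
        """Drop the itemset at the beginning of the window (obsolete data)."""
        _, items = self.window.popleft()
        for item in items:
            occ = self.occurrences[item]
            occ.popleft()  # positions are increasing, so the oldest is first
            if not occ:
                del self.occurrences[item]

    def frequent_items(self) -> Set[str]:
        """Items whose number of occurrences reaches the minimal support."""
        return {i for i, occ in self.occurrences.items()
                if len(occ) >= self.min_support}


stream = [{"a", "b"}, {"a"}, {"b", "c"}, {"c"}]
tracker = WindowOccurrences(window_size=3, min_support=2)
snapshots: List[Set[str]] = []
for itemset in stream:
    tracker.add(itemset)
    snapshots.append(tracker.frequent_items())
```

Keeping occurrence positions instead of counts is what makes the deletion step cheap in this sketch: removing the oldest itemset only touches the head of the occurrence lists of the items it contains.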

Multiscale segmentation of satellite image time series

Satellite images make it possible to observe ground vegetation at a large scale. Images are available over several years with a high acquisition frequency (one image every two weeks). Such data are called satellite image time series (SITS). In [9], we present a method to segment an image by characterizing the evolution of a vegetation index (NDVI) on two scales: annual and multi-year. We test this method on SITS of Senegal and compare it to a direct classification of the time series. The results show that our method, using two time scales, better differentiates regions in the median zone of Senegal and locates small areas of interest (cities, forests, agricultural areas).
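As an illustration of the two time scales, the following Python sketch processes the NDVI series of a single pixel. The NDVI formula, (NIR - Red) / (NIR + Red), is the standard one. Everything else is an assumption made for the example and not the characterization used in [9]: the annual features (mean and amplitude per year), the multi-year feature (least-squares slope of the annual means), and the function names. The example uses 4 samples per year to stay short, whereas one image every two weeks gives about 26.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index of one pixel."""
    total = nir + red
    # Returning 0.0 for a zero denominator is a convention chosen here.
    return (nir - red) / total if total != 0 else 0.0


def annual_features(series: Sequence[float], per_year: int) -> List[Tuple[float, float]]:
    """Annual scale: (mean, amplitude) of the NDVI for each full year."""
    features = []
    for start in range(0, len(series) - per_year + 1, per_year):
        year = series[start:start + per_year]
        features.append((mean(year), max(year) - min(year)))
    return features


def multi_year_trend(annual_means: Sequence[float]) -> float:
    """Multi-year scale: least-squares slope of annual means per year."""
    n = len(annual_means)
    if n < 2:
        return 0.0  # a trend needs at least two years
    x_mean = (n - 1) / 2
    y_mean = mean(annual_means)
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(annual_means))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


# Hypothetical NDVI series of one pixel: 3 years, 4 samples per year.
pixel_series = [0.2, 0.6, 0.6, 0.2,
                0.1, 0.5, 0.5, 0.1,
                0.0, 0.4, 0.4, 0.0]
features = annual_features(pixel_series, per_year=4)
trend = multi_year_trend([m for m, _ in features])
```

In this toy series the seasonal amplitude is the same every year while the annual mean declines, so the pixel is described by a stable annual profile and a negative multi-year trend. A segmentation step could then group pixels on such per-pixel feature vectors.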